v0.2 Status
Just started planning the changes
Status
- v0.2 uses json-to-mysql v1.0
- v0.3 uses json-to-mysql v2.0
TODO:
- clean up this document & remove duplicated notes
Where I'm At
- I wrote
    - `convert` to do all conversions
    - `convert_to_csv` to convert .ods to csv
    - `clean_csv` to clean the `convert_to_csv` output
    - `convert_to_json` to convert CLEAN csv to json
- I copied in
    - `clean_row`
    - `clean_value`
- I wrote tests
    - `testCleanCsv` to test the `clean_csv` method
    - `testCsvToJson` to test the `convert_to_json` method
- I need to
    - `convert_to_csv`: Consider moving the `clean_csv` call OUT of it & instead doing this in the `if ($clean_csv)` conditional of `convert`
        - Alternate (or additional): Accept optional path(s) for the output file location(s)
    - test `convert_to_csv`
    - write `convert_to_sql` (mostly just copy, then edit)
    - write `convert_to_sqlite` (mostly just copy)
        - probably should be a different name, bc it's just executing sql, NOT really converting.
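A minimal sketch of how that restructuring might look: cleaning happens only inside the `if ($clean_csv)` branch of the top-level function, and the output paths are optional arguments. Everything here is an assumption for illustration (the `_sketch` function names, the parameters, the `clean.csv` / `out.json` defaults and the cleaning rules); it is not the actual `\Tlf\DataConverter` code, and the .ods step is left out because it needs an external tool.

```php
<?php
// Hypothetical sketch: function names, parameters and cleaning rules are
// assumptions for illustration, not the real \Tlf\DataConverter code.

/** Trim every cell and drop rows that are entirely empty. */
function clean_csv_sketch(string $in_path, string $out_path): void
{
    $in = fopen($in_path, 'r');
    $out = fopen($out_path, 'w');
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        $row = array_map(fn($cell) => trim((string) $cell), $row);
        if (implode('', $row) === '') {
            continue; // blank line or row of empty cells
        }
        fputcsv($out, $row, ',', '"', '\\');
    }
    fclose($in);
    fclose($out);
}

/** Treat the first csv row as headers and write the rest as a json array of objects. */
function convert_to_json_sketch(string $csv_path, string $json_path): void
{
    $in = fopen($csv_path, 'r');
    $headers = fgetcsv($in, 0, ',', '"', '\\') ?: [];
    $count = count($headers);
    $rows = [];
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        // Pad or cut each row so it always lines up with the headers.
        $row = array_pad(array_slice($row, 0, $count), $count, '');
        $rows[] = array_combine($headers, $row);
    }
    fclose($in);
    file_put_contents($json_path, json_encode($rows, JSON_PRETTY_PRINT));
}

/** Cleaning is decided here, not inside the csv step. Output paths are optional. */
function convert_sketch(
    string $csv_path,
    bool $clean_csv = true,
    ?string $clean_path = null,
    ?string $json_path = null
): array {
    $json_path = $json_path ?? dirname($csv_path).'/out.json';
    if ($clean_csv) {
        $clean_path = $clean_path ?? dirname($csv_path).'/clean.csv';
        clean_csv_sketch($csv_path, $clean_path);
        $csv_path = $clean_path;
    }
    convert_to_json_sketch($csv_path, $json_path);
    return ['csv' => $csv_path, 'json' => $json_path];
}
```

Calling `convert_sketch($path, false)` skips the cleaning step and converts the csv as-is, which is the point of moving the call out of the csv step.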
TODO
- DONE Create a proper data/file structure for the tests with at least two data sources
- DONE integrate phptests & write a proper test case
- DONE Write `\Tlf\DataConverter` class
- IN PROGRESS fill out methods one at a time from the "Running it" section below
- Update the README.src.md to make sure the example is correct
- run scrawl & push
- update default branch to v0.2
Notes from my phone (Oct 28, 2021)
- Change json-to-mysql into a general-purpose data converter. Create a liaison app to facilitate it all via a data dir. Probably new Addon($package) (or new Compo(...)) until I update liaison. This enables data conversion & display for the given package.
- Data lib new setup: add just one data source, or a dir containing multiple data sources. (Can one data source have multiple spreadsheets???) Structure:

        AppDir/data/
            idph/
                config.json   - or can just use a global config for all the data dirs & allow per-source overrides
                source.[ods|csv|json]
                out.json
                out.csv
                etc...
                files/
                    ... Just associated files
            cdc/
                source.ods
                ....
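
The "global config with per-source overrides" idea could be sketched like this. The function names and the config keys used in the usage note are hypothetical; only the `config.json` file name and the data dir layout come from the note above.

```php
<?php
// Hypothetical sketch: function names and config keys are assumptions.

/** Read a json config file as an array. A missing or invalid file gives []. */
function read_json_config(string $path): array
{
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode(file_get_contents($path), true);
    return is_array($data) ? $data : [];
}

/** Merge the data dir's global config with one source's own config. */
function load_source_config(string $data_dir, string $source): array
{
    $data_dir = rtrim($data_dir, '/');
    $global = read_json_config($data_dir.'/config.json');
    $local = read_json_config($data_dir.'/'.$source.'/config.json');
    // Values from the source's own config win over the global ones.
    return array_replace_recursive($global, $local);
}
```

With a global `{"header_row": 0}` and an `idph/config.json` of `{"header_row": 2}`, `load_source_config($data_dir, 'idph')` would give `header_row` 2, while a source without its own config file just gets the global values.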
Goals
Quality of Life features
- `meta.json` file is optional initially (one will be created for you)
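
A rough sketch of the "created for you" behavior, assuming a hypothetical `ensure_meta()` helper. The key names and default values are guesses based on the fields this document lists for `meta.json` (source name, sqlite table name, header offset, first data row offset), not a settled format.

```php
<?php
// Hypothetical sketch: ensure_meta() and all key names are assumptions.

/** Return the source's meta as an array, writing a default meta.json if none exists. */
function ensure_meta(string $source_dir): array
{
    $source_dir = rtrim($source_dir, '/');
    $path = $source_dir.'/meta.json';
    $name = basename($source_dir);
    $defaults = [
        'name' => $name,
        'table' => str_replace('-', '_', $name), // sqlite-friendly table name
        'header_row' => 0,
        'first_data_row' => 1,
    ];
    if (!is_file($path)) {
        file_put_contents($path, json_encode($defaults, JSON_PRETTY_PRINT));
        return $defaults;
    }
    $meta = json_decode(file_get_contents($path), true);
    // Keys missing from an existing meta.json fall back to the defaults.
    return is_array($meta) ? $meta + $defaults : $defaults;
}
```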
Running it
$converter = new \Tlf\DataConverter();
$converter->addSource($dir.'source-name');
$converter->addSource($dir.'source-name2');
$converter->convert();
$meta = $converter->meta('ns:source-name'); // return meta.json as array
$metaMd = $converter->metaMd('ns:source-name'); // return contents of meta.md
$metaMdPath = $converter->metaMdPath('ns:source-name'); // return path to meta.md file
$out = $converter->files('ns:source-name'); // return array of paths to files
// $out looks like: ['json'=>'/path/to/out.json', 'sql'=>'/path/to/out.sql', ...]
File structure:
source-name/
    meta.json ## define source name, sqlite table name, offset of table headers, offset of first data row ... idk what else
    meta.md ## a meta information file intended for human consumption / delivery on a web page
    files/
        ... any relevant input files, such as images, source data, web pages, whatever is relevant
    source.ods ## in v0.3 I want source to allow any of these file types
    out.json
    out.csv
    out.ods ## just a copy of source.ods??
    out.sql ## combines CREATE TABLE & INSERT statement files from json-to-mysql output
    out.sqlite
    out/
        ... output files from json-to-mysql
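
Given this layout, the array returned by `files()` could plausibly be built by scanning for `out.*` files. The sketch below is an assumption about how that might work, with a hypothetical standalone function name; it is not the actual method.

```php
<?php
// Hypothetical sketch: one possible way to collect a source's output files.

/** Map each out.* file in the source dir to its extension, e.g. 'json' => '.../out.json'. */
function source_output_files(string $source_dir): array
{
    $files = [];
    foreach (glob(rtrim($source_dir, '/').'/out.*') ?: [] as $path) {
        if (!is_file($path)) {
            continue; // only regular files are listed
        }
        $files[pathinfo($path, PATHINFO_EXTENSION)] = $path;
    }
    ksort($files); // stable order regardless of filesystem
    return $files;
}
```

The `out/` directory does not match the `out.*` pattern, so the raw json-to-mysql output stays out of the list.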
Maybe for v0.3
- Create a conversions log file, so we know every time a new conversion was done & maybe how many rows were added (or changed? Probably not diffing to that degree though)
- Allow custom offsets & stuff for source data files (currently has a very strict setup)
- start with any available format & convert to all the others
- Add extensible class that each data source CAN implement (perhaps in a `meta.php` file) to customize certain features
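
The conversions log idea might be as small as appending one line per conversion. This is only a sketch: the `conversions.log` file name, the tab-separated line format and both function names are assumptions, and it records a total row count rather than any kind of diff.

```php
<?php
// Hypothetical sketch: file name, line format and function names are assumptions.

/** Count the rows in a json output file (0 if it is not a json array). */
function count_json_rows(string $json_path): int
{
    $rows = json_decode(file_get_contents($json_path), true);
    return is_array($rows) ? count($rows) : 0;
}

/** Append "timestamp, format, row count" to the source's log and return the line. */
function log_conversion(string $source_dir, string $format, int $row_count): string
{
    $line = sprintf("%s\t%s\t%d rows\n", date('c'), $format, $row_count);
    $log_path = rtrim($source_dir, '/').'/conversions.log';
    file_put_contents($log_path, $line, FILE_APPEND | LOCK_EX);
    return $line;
}
```

Comparing the row count of the newest line with the previous one would give a rough "rows added" figure without diffing the data itself.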
Code to consider
This is code used on a site of mine to help me display meta files on the web. I want to correlate a data name (like case/mchd-daily) with a set of data outputs
<section>
<?php
    // Build the lookup key, e.g. 'case/mchd-daily'.
    $view = $view_category .'/'. $view_name;
    // Map each data name to the base name of its meta file / download source.
    $map = [
        'case/mchd-daily'=>'mchd-daily-cases',
        'case/mchd-weekly'=>'mchd-weekly-cases',
        'death/mchd-daily'=>'mchd-daily-deaths',
        'death/mchd-weekly'=>'mchd-weekly-deaths',
        'death/rates-by-county'=>'death-rates',
        'hospital/mhs'=>'mhs-hospitalizations',
        'hospital/hshs'=>'hshs-hospitalizations',
        'idph/daily'=>'idph-cases',
        'idph/weekly'=>'idph-cases',
        'vacc/ByDay'=>'cdc-vaccinations',
        'vacc/locations'=>'vaccine-locations',
        'variant/mchd'=>'mchd-variants',
    ];
    // The map keys use hyphens, so normalize underscores before the lookup.
    $view = str_replace('_', '-', $view);
    if (!isset($map[$view])){
        echo "There was an error.";
        return;
    }
    $meta_file = $map[$view].'.md';
    $root = $_SERVER['DOCUMENT_ROOT'];
    $data_dir = $root.'/vendor/taeluf/data.macon-county-covid/data/';
    $metaFile = $data_dir.$meta_file;
    // Print the meta file, if there is one, wrapped in <markdown> tags.
    if (is_file($metaFile)){
        echo "<markdown>\n"
            .file_get_contents($metaFile)
            ."\n</markdown>"
        ;
    }
    // Downloads for this source, filtered by the mapped name.
    echo $lia->view('covid:downloads', ['by'=>'source', 'filter'=>$map[$view]]);
    echo $lia->phad($view, ['access.name'=>'all']);
?>
</section>